feat(container-loader): add captureFullContainerState free function#27220
feat(container-loader): add captureFullContainerState free function#27220markfields wants to merge 26 commits intomicrosoft:mainfrom
Conversation
Adds a driver-only free function that captures a container's current state in the IPendingContainerState wire format using only an IDocumentServiceFactory and IUrlResolver. Unlike Container.getPendingLocalState(), no runtime or codeLoader is instantiated: the function fetches the latest snapshot, reads the authoritative sequence number from the snapshot's attributes blob, drains ops from delta storage from that sequence number, and serializes the result. pendingRuntimeState is undefined, so the output is intended for state relay, inspection, and durable-state snapshot use cases rather than rehydrating in-flight DDS changes. The output can be fed back into loadExistingContainer or loadFrozenContainerFromPendingState as pendingLocalState. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tainerPendingState Extends captureContainerPendingState so the driver-level state is fully self-contained for blob reads as well. Attachment blob bytes are fetched and added to snapshotBlobs keyed by storage ID, which ContainerStorageAdapter already serves through its cache — no wire-format change required. GC state is consulted when present: blobs GC has explicitly marked unreferencedTimestampMs, tombstoned, or deleted are skipped. Blobs absent from the GC graph are kept, since GC lag can leave recently-attached blobs off the graph and dropping them would lose live data. When the snapshot has no GC tree (GC disabled or pre-GC document), every attachment blob from the BlobManager redirect table is included. The relevant blob manager / GC constants and the minimal parsing logic are duplicated locally to avoid a loader → runtime dependency; comments point back to the canonical definitions in container-runtime and runtime-definitions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ware
Extends the driver-only capture to cover the whole referenced graph of the
container, not just attachment blobs.
- Honours ISnapshotTree.unreferenced: a shared tree walker skips any subtree
flagged unreferenced by the summarizer (which sets the flag from GC state)
and inlines contents of every other blob it reaches. Replaces the
unfiltered getBlobContentsFromTree path.
- Pre-fetches loading-group snapshots: enumerates groupIds on the base
snapshot (skipping unreferenced subtrees), fetches each via
IDocumentStorageService.getSnapshot({ versionId, loadingGroupIds }), runs
the fetched snapshot through the same tree walker, and serialises the
result into IPendingContainerState.loadedGroupIdSnapshots. If the driver
lacks getSnapshot support or no groupIds are declared, no groups are
included.
- GC parsing is done once and shared between tree-level and attachment-blob
filtering. captureReferencedAttachmentBlobs now takes pre-parsed GC data.
Renames captureAttachmentBlobs.ts to captureReferencedContents.ts to reflect
the broader scope.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers the reachability filtering and groupId-fetch paths that the local end-to-end tests can't easily exercise (no summarizer runs in the local test server, and TestFluidObjectFactory doesn't produce loading-group datastores out of the box). Tests construct ISnapshotTree fixtures directly and back readBlob/getSnapshot with an in-memory shim. 15 cases across readReferencedSnapshotBlobs, parseGcSnapshotData, captureReferencedAttachmentBlobs, and captureGroupIdSnapshots — covering unreferenced subtree skip, root .blobs special-casing, ISnapshot vs ISnapshotTree input, GC lag tolerance (blobs absent from gcNodes are kept), tombstone/deletedNodes skip, and groupId enumeration/dedup/fetch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ptureFullContainerState The function captures the whole referenced graph of the container (snapshot, loading-group snapshots, inlined structural blobs, inlined attachment blobs, trailing ops) — not just "pending state." The new name matches the scope. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e.spec Collapse a multi-line chained .get() call onto a single line to satisfy biome's formatter — CI was failing on `biome check .` in local-server-tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…inerState Four issues flagged by Copilot review on PR microsoft#27100: 1. captureFullContainerState created an IDocumentService but never called dispose(). Wrap the capture body in try/finally and dispose in the finally to release driver-held resources (sockets/caches). 2. readReferencedSnapshotBlobs fanned every blob read at every tree level into a single Promise.all, giving unbounded concurrency on large snapshots. Refactor into a collect-then-fetch pipeline: walk the tree synchronously to gather referenced blob ids, then fetch via a new mapWithConcurrency helper capped at 32 in-flight reads. 3. captureReferencedAttachmentBlobs had the same unbounded-parallel issue over all referenced attachment storage ids. Route through the same mapWithConcurrency helper. 4. collectUnreferencedBlobLocalIds returned undefined when gcData.gcState was undefined, silently dropping tombstones/deletedNodes filtering even when those lists were populated. Contradicted the function docs. Now always applies tombstones/deletedNodes regardless of gcState presence, and returns a (possibly empty) Set rather than undefined. Added a unit test covering the gcState-undefined-but-tombstones-present case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Follow-up to 8a748cc: captureGroupIdSnapshots still fanned every getSnapshot call into a single Promise.all. Route through mapWithConcurrency with a lower limit (4) since each call pulls a whole snapshot tree, not a single blob. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ip cache Two issues from PR review: 1. Bind getSnapshot when extracting it from the storage service. Real driver implementations reference `this` (e.g., LocalDocumentStorageService.getSnapshot reads this.id), so calling the detached method would TypeError in strict mode. Mirrors the bind pattern in protocolTreeDocumentStorageService.ts:31. Added a class-based unit test stub whose getSnapshot touches `this` — would have caught this. 2. Pass cacheSnapshot: false on every getSnapshot call we make from the capture path. This capture is transient; we don't want to pollute the driver's snapshot cache with it. Covered by a unit test asserting the option is forwarded. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Hi! Thank you for opening this PR. Want me to review it? Based on the diff (1681 lines, 14 files), I've queued these reviewers:
How this works
|
- Wire-format consts POJO + contract test - GC-interesting test - Monitoring context wired (to be reverted) - API report regenerated
There was a problem hiding this comment.
Pull request overview
This PR adds a new driver-only capture API to @fluidframework/container-loader that can produce a portable IPendingContainerState JSON for an attached document without creating a Loader/Container/Runtime, enabling fully-offline frozen-container rehydration scenarios.
Changes:
- Adds
captureFullContainerState(@legacy @alpha) to capture the latest snapshot + post-snapshot ops, inline referenced snapshot blobs, and inline referenced attachment blob bytes (base64) for portability. - Extends the pending-state wire format with
attachmentBlobContents(base64) and wires decoding/deduping through load (SerializedStateManager) and storage (PendingLocalStateStore). - Adds unit + local-server integration coverage, plus a contract test to detect drift in duplicated wire-format constants.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/loader/container-loader/src/createAndLoadContainerUtils.ts | Implements captureFullContainerState and its props interface; snapshot/op capture and assembly of IPendingContainerState. |
| packages/loader/container-loader/src/captureReferencedContents.ts | New GC-aware snapshot walker + attachment-blob capture helpers and exported wireFormatConstants. |
| packages/loader/container-loader/src/serializedStateManager.ts | Adds attachmentBlobContents to IPendingContainerState and decodes/merges it into the blob cache on load. |
| packages/loader/container-loader/src/pendingLocalStateStore.ts | Dedupes the new attachmentBlobContents map across stored pending states. |
| packages/loader/container-loader/src/containerStorageAdapter.ts | Introduces IBase64BlobContents type to make the base64-vs-utf8 encoding contract explicit. |
| packages/loader/container-loader/src/index.ts | Exports captureFullContainerState, ICaptureFullContainerStateProps, and wireFormatConstants (internal). |
| packages/loader/container-loader/src/test/captureReferencedContents.spec.ts | Unit tests for GC parsing/filtering, referenced-blob walking, attachment-blob behavior, and loading-group detection. |
| packages/test/local-server-tests/src/test/captureFullContainerState.spec.ts | Local-server integration tests for capture → frozen rehydrate, ops after snapshot, nested DDS handles, and binary attachment blobs. |
| packages/test/local-server-tests/src/test/wireFormatConstants.spec.ts | Contract test ensuring loader-duplicated wire-format constants match runtime/runtime-definitions sources. |
| packages/runtime/container-runtime/src/index.ts | Re-exports internal blob-manager wire-format constants for the contract test. |
| packages/runtime/container-runtime/src/blobManager/blobManager.ts | Marks blobManagerBasePath as @internal for extraction/export hygiene. |
| packages/runtime/container-runtime/src/blobManager/blobManagerSnapSum.ts | Marks redirectTableBlobName as @internal for extraction/export hygiene. |
| packages/loader/container-loader/api-report/container-loader.legacy.alpha.api.md | API report update for the new @legacy @alpha export and props interface. |
| .changeset/wide-foxes-behave.md | Changeset for the new API and related internal exports. |
| full-container-state-review-notes.md | Adds detailed review notes / design and coverage tracking document. |
|
🔗 No broken links found! ✅ Your attention to detail is admirable. linkcheck output |
Deep ReviewReviewed commit Readiness: 7/10 — 🟢 ALMOST READY Not ready for sign-off, but close. The prior round's two open Tier 2s are resolved on the code axis: the wire-format extension is now called out in Path to Ready
Context for Reviewers
For human reviewer
Review history (5 prior reviews)
|
| return { tree: snapshot, read: async (id) => storage.readBlob(id) }; | ||
| } | ||
|
|
||
| /** |
There was a problem hiding this comment.
Deep Review: UTF-8/base64 encoding split — process sign-off ask, no code change.
captureReferencedAttachmentBlobs writes base64 (bufferToString(buffer, "base64")) into a new IBase64BlobContents; readReferencedSnapshotBlobs writes UTF-8 into ISerializableBlobContents (serializedStateManager.ts:284-299, decode in containerStorageAdapter.ts:515-527). The split is technically correct and aligns with the runtime's existing pending-blob serializer (blobManager.ts:340-343, :951-960); the non-UTF-8 round-trip e2e test at captureFullContainerState.spec.ts:351-427 covers the binary case.
Process concern: vladsud's "we should stick to one format across layers for blobs" position from PR #20507 (with anthony-murphy explicitly looped in) hasn't been revisited on the record. vladsud is already requested; please also tag anthony-murphy so the area's "one format" stance is consciously superseded rather than silently bypassed. The binary-safety argument for the split is sound — no code change expected.
| * Optional logger for driver-side telemetry. | ||
| */ | ||
| readonly logger?: ITelemetryBaseLogger | undefined; | ||
| } |
There was a problem hiding this comment.
Deep Review: Add abortSignal?: AbortSignal to ICaptureFullContainerStateProps before @alpha graduation.
ICaptureFullContainerStateProps exposes only urlResolver, documentServiceFactory, request, logger. The capture body concurrently reads snapshot/attachment blobs (bounded concurrency 32) and drains the op stream without a cancellation check. IDocumentDeltaStorageService.fetchMessages already accepts abortSignal?: AbortSignal (packages/common/driver-definitions/src/storage.ts:103-109), so the underlying primitives support it.
Suggested fix: add abortSignal?: AbortSignal to ICaptureFullContainerStateProps, propagate it into the mapWithConcurrency blob/group loops and the fetchMessages op-drain loop, and call signal.throwIfAborted() between bounded batches. Additive and cheaper to land before the API graduates from @alpha.
There was a problem hiding this comment.
This is a good idea, adding it to follow-ups mentioned in PR description
| attachmentBlobContents, | ||
| ); | ||
| Object.assign(attachmentBlobContents, added); | ||
| } |
There was a problem hiding this comment.
Deep Review: Op-stream drain trusts the driver silently — assert contiguity to fail closed.
captureFullContainerState drains connectToDeltaStorage().fetchMessages(attributes.sequenceNumber + 1, …) into savedOps and emits whatever the driver yields. Other parts of the same function fail-closed on suspicious states (UsageError on loading groups; GenericError on empty snapshot fetch); ops are the one place that trusts the driver silently. A buggy driver / network truncation would produce an artifact whose ops cannot be replayed, surfacing as a confusing rehydration-time failure pointing at the loader rather than at the capture.
Suggested fix: track a previous-sequence-number across the drain loop and assert each op's sequenceNumber === prev + 1; throw on a gap with the offending pair in the message. Optional: also assert the snapshot's latestSequenceNumber >= attributes.sequenceNumber at fetch time.
There was a problem hiding this comment.
Another good point. A downside to not using the DeltaManager code in the container loader, I guess. Tracking as a follow-up.
| gcBlobPrefix: "__gc", | ||
| gcTombstoneBlobKey: "__tombstones", | ||
| gcDeletedBlobKey: "__deletedNodes", | ||
| } as const; |
There was a problem hiding this comment.
Deep Review: Loader↔runtime constant duplication — gatekeeper nod ask, no code change.
This PR introduces a wireFormatConstants block in the loader mirroring seven runtime-side constants (.blobs, .redirectTable, _blobs, gc, __gc, __tombstones, __deletedNodes) and a cross-package deepStrictEqual contract test (wireFormatConstants.spec.ts). PR #21729 (anthony-murphy, 2024-07-02) was the deliberate refactor that produced these constants in their current form, with markfields reviewing for path-format discipline. This PR is the first cross-layer consumer; the chosen pattern (duplicate + contract test) sets precedent for any future shared wire-format constant.
Tag anthony-murphy for an explicit nod on the duplicate+contract-test layering choice, or merge with a clear note that AB#72934 will revisit (per thread #3192306888). No code change expected.
| urlResolver, | ||
| documentServiceFactory, | ||
| request, |
There was a problem hiding this comment.
@anthony-murphy thinking about replacing these 3 params with just documentService and having caller piece that together (maybe in follow-up PR). Does that work? I don't totally understand the environment in which this will be called and where these params will come from
There was a problem hiding this comment.
its tricky. overall i would love to move in a direction where fluid takes a document service, but it could be hard right now due lack of usage that way. there won't be any app examples, and i'm not sure the odsp driver supports it. i would be 100% down to support both maybe via overload or union to start moving in that direction, as we need to start somewhere
Adds
captureFullContainerState, a@legacy @alphafree function incontainer-loaderthatproduces an
IPendingContainerStateJSON for an attached document without instantiating aLoader, Container, or runtime. It drives only
IUrlResolver+IDocumentServiceFactory,fetches the latest snapshot via
getSnapshot/getSnapshotTree, drains ops viafetchMessages(seq+1, …), and inlines blob contents and loading-group snapshots so theartifact is fully portable.
This is the missing piece in the frozen-container series:
asLegacyAlpha/ContainerAlphasurface)loadFrozenContainerFromPendingState)createFrozenDocumentServiceFactory)Together with
loadFrozenContainerFromPendingState,captureFullContainerStateenablesfully-offline frozen-container scenarios.
Design notes
unreferenced: truesubtrees, and theattachment-blob filter parses GC tombstones / deleted-nodes with documented GC-lag
tolerance (blobs absent from the GC graph are kept).
mapWithConcurrency: 32 blobs, 4 group snapshots).captureFullContainerStatethrows
UsageErrorif any referenced subtree carries agroupId. This will be reintroduced whenthere's a known consumer and e2e harness.
container-runtime. Across-package contract test fails CI on drift.
blobs since those are text-based payloads.
See also #27100, and this markdown file used during branch development,
Follow-ups
See this list of follow-ups accrued during development. These will be moved to ADO tasks once the PR is merged.
Additionally, these from Deep Review on a later iteration:
And some of my own thoughts on final review pass:
loadExistingContainerandloadFrozenContainerFromPendingState?